Replaceing the metadata parser from packaging.metadata#607
Replaceing the metadata parser from packaging.metadata#607LalatenduMohanty wants to merge 1 commit intopython-wheel-build:mainfrom
Conversation
8bd775c to
1247975
Compare
1247975 to
2132d61
Compare
2132d61 to
42ef5bf
Compare
cb3813b to
bdada13
Compare
|
@tiran @dhellmann PTAL |
dhellmann
left a comment
There was a problem hiding this comment.
This looks good to me. How many places in the code do we do something similar to parse metadata? How useful would it be to have a function that takes a path and returns the metadata?
|
|
My bad. I should have checked all code to see if same pattern exists else where. I can see https://github.com/python-wheel-build/fromager/blob/main/src/fromager/candidate.py#L82 . I do not think we need a common function yet. PTAL and let me know. |
Let me get back to you on this. |
6e992d6 to
a1a70e7
Compare
|
@tiran Since fromager only reads metadata for dependency resolution and doesn't need to write it back, round trip safety isn't necessary. The benefits of type safety and validation from packaging.metadata outweigh the loss of unknown fields that aren't being used anyway. |
a1a70e7 to
d948afc
Compare
src/fromager/bootstrapper.py
Outdated
| raw_metadata, _ = parse_email(f.read()) | ||
| metadata = Metadata.from_raw(raw_metadata) |
There was a problem hiding this comment.
Why are you using parse_email() + Metadata.from_raw() instead of Metadata.parse_email()? The Metadata.parse_email() combines parse_email(), Metadata.from_raw(), and additional validation.
This code should probably use fromager.dependencies.parse_metadata(metadata_filename).
7517935 to
93625a4
Compare
93625a4 to
a077307
Compare
|
@tiran PTAL when you have a chance |
18668f8 to
90afccf
Compare
90afccf to
56b0acd
Compare
5554a83 to
c8ee75b
Compare
b5ab037 to
7e8504b
Compare
src/fromager/dependencies.py
Outdated
| wheel_name_parts = wheel_filename.stem.split("-") | ||
| dist_name = wheel_name_parts[0] | ||
| dist_version = wheel_name_parts[1] | ||
| predicted_dist_info = f"{dist_name}-{dist_version}.dist-info/METADATA" |
There was a problem hiding this comment.
Let's not invent our own wheel parsing algorithm. The function add_extra_metadata_to_wheels has some code to get the dist-info directory of a wheel file. Perhaps move the code into a common, shared helper?
There was a problem hiding this comment.
Good catch. I am going to use parse_wheel_filename() i.e. the same underlying function from add_extra_metadata_to_wheels
src/fromager/bootstrapper.py
Outdated
| p = BytesParser() | ||
| metadata = p.parse(f, headersonly=True) | ||
| return Version(metadata["Version"]) | ||
| metadata = dependencies.parse_metadata(metadata_filename) |
There was a problem hiding this comment.
parse_metadata validates the metadata by default. This will raise an exception if the metadata version does not match the metadata content, e.g. license-file field in Metadata < 2.4.
There was a problem hiding this comment.
We should add validate=False here i.e. metadata = dependencies.parse_metadata(metadata_filename, validate=False)
Replaces the Python email library parser with packaging.metadata.Metadata for parsing wheel/package metadata. Fixes: python-wheel-build#561 Co-Authored-By: Claude <claude@anthropic.com> Signed-off-by: Lalatendu Mohanty <lmohanty@redhat.com>
7e8504b to
d28acd7
Compare
Replaces the Python email library parser with packaging.metadata.Metadata for parsing wheel/package metadata.
Fixes #561